70 research outputs found

    Quality-aware adaptive delivery of multi-view video

    Advances in video coding and networking technologies have paved the way for Multi-View Video (MVV) streaming. However, large amounts of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D viewing experience may be degraded significantly unless quality-aware adaptation methods are deployed. No existing research work discusses MVV adaptation decision strategies or provides a detailed analysis of a dynamic network environment. This work addresses these issues for MVV streaming over HTTP for emerging multi-view displays. The effects of various adaptation decision strategies are evaluated and, as a result, a new quality-aware adaptation method is designed. The proposed method benefits from layer-based video coding in such a way that high Quality of Experience (QoE) is maintained in a cost-effective manner. Experimental results on MVV streaming using the proposed strategy show that the perceptual 3D video quality under adverse network conditions is enhanced significantly as a result of the proposed quality-aware adaptation.
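    The layer-based adaptation idea described above can be sketched as follows. This is a minimal, assumed illustration (the function, layer names, and bitrates are hypothetical, not taken from the paper): a scalably coded stream carries a base layer plus enhancement layers, and the client keeps only as many layers as the measured throughput allows.

```python
def select_layers(layers, available_kbps):
    """layers: ordered list of (name, kbps), base layer first.

    Enhancement layers are only decodable on top of the lower layers,
    so selection stops at the first layer that no longer fits.
    """
    kept, total = [], 0.0
    for name, kbps in layers:
        if total + kbps > available_kbps:
            break
        kept.append(name)
        total += kbps
    return kept

stream = [("base", 800), ("enh1", 600), ("enh2", 600)]
print(select_layers(stream, 1500))  # congested: base + enh1 only
```

    Dropping from the top of the layer hierarchy keeps the stream decodable at a graceful, lower quality, which is what makes layered coding cost-effective under congestion.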

    Deep Multi-Critic Network for accelerating Policy Learning in multi-agent environments

    Humans live among other humans, not in isolation. Therefore, the ability to learn and behave in multi-agent environments is essential for any autonomous system that intends to interact with people. Due to the presence of multiple simultaneous learners in a multi-agent learning environment, the Markov assumption used for single-agent environments is not tenable, necessitating the development of new Policy Learning algorithms. Recent Actor-Critic algorithms proposed for multi-agent environments, such as Multi-Agent Deep Deterministic Policy Gradients and Counterfactual Multi-Agent Policy Gradients, use the same mathematical framework as single-agent environments by augmenting the Critic with extra information. However, this extra information can slow down the learning process and afflict the Critic with the curse of dimensionality. To combat this, we propose a novel Deep Neural Network configuration called the Deep Multi-Critic Network. This architecture works by taking a weighted sum over the outputs of multiple critic networks of varying complexity and size. The configuration was tested on data collected from a real-world multi-agent environment. The results illustrate that with the Deep Multi-Critic Network, less data is needed to reach the same level of performance as without it. This suggests that, because the configuration learns faster from less data, the Critic may learn Q-values faster, accelerating Actor training as well.
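    The core mechanism, a weighted sum over the outputs of critics of varying capacity, can be sketched as below. This is a toy illustration under stated assumptions: the critics are stand-in linear networks with random weights, and the class and function names are hypothetical, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearCritic:
    """A toy critic: a tanh hidden layer mapping a state-action vector to a Q-value."""
    def __init__(self, in_dim, hidden):
        self.w1 = rng.normal(size=(in_dim, hidden)) * 0.1
        self.w2 = rng.normal(size=(hidden,)) * 0.1

    def q_value(self, x):
        return np.tanh(x @ self.w1) @ self.w2

def multi_critic_q(critics, weights, x):
    """Weighted sum of the critics' Q-value estimates."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise the mixing weights
    return sum(wi * c.q_value(x) for wi, c in zip(w, critics))

# Three critics of increasing capacity share the same input.
critics = [LinearCritic(8, h) for h in (4, 16, 64)]
x = rng.normal(size=(8,))
q = multi_critic_q(critics, [0.5, 0.3, 0.2], x)
print(float(q))
```

    The intuition is that a small critic converges quickly from little data while a large critic eventually captures the richer joint state, and the weighted combination can benefit from both.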

    Analysis by synthesis spatial audio coding

    This study presents a novel spatial audio coding (SAC) technique, called analysis by synthesis SAC (AbS-SAC), capable of minimising the signal distortion introduced during the encoding processes. The reverse one-to-two (R-OTT) module, applied in MPEG Surround to down-mix two channels into a single channel, is first configured as a closed-loop system. This closed-loop module reduces the quantisation errors of the spatial parameters, leading to an improved quality of the synthesised audio signals. Moreover, a sub-optimal AbS optimisation, based on the closed-loop R-OTT module, is proposed. This algorithm addresses the impracticality of implementing an optimal AbS optimisation while still further improving the quality of the reconstructed audio signals. In terms of algorithm complexity, the proposed sub-optimal algorithm provides scalability. The results of objective and subjective tests are presented. A significant improvement in objective performance is achieved when compared to the conventional open-loop approach. Furthermore, subjective tests show that the proposed technique achieves higher subjective difference grade scores than the tested Advanced Audio Coding multichannel technique.
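    The open-loop versus closed-loop distinction can be illustrated with a simplified two-to-one down-mix. This sketch is not the paper's exact algorithm: the channel level difference (CLD) synthesis gains and the quantiser grid are common textbook forms assumed for illustration. The closed-loop idea is to pick the quantised parameter that minimises the error of the re-synthesised stereo pair, rather than quantising the analysed parameter directly.

```python
import numpy as np

rng = np.random.default_rng(1)
l = rng.normal(size=256)          # left channel (synthetic)
r = 0.5 * rng.normal(size=256)    # quieter right channel (synthetic)

cld_grid = np.linspace(-20.0, 20.0, 31)  # hypothetical quantiser levels in dB

def synthesize(m, cld_db):
    """Re-create a stereo pair from mono m using a CLD in dB."""
    g = 10.0 ** (cld_db / 20.0)
    gl = g / np.sqrt(1.0 + g * g)
    gr = 1.0 / np.sqrt(1.0 + g * g)
    return np.sqrt(2.0) * gl * m, np.sqrt(2.0) * gr * m

m = (l + r) / np.sqrt(2.0)  # energy-preserving down-mix

# Closed loop: search the quantiser levels for the minimum synthesis error.
errors = []
for cld in cld_grid:
    lh, rh = synthesize(m, cld)
    errors.append(np.sum((l - lh) ** 2 + (r - rh) ** 2))
best_cld = cld_grid[int(np.argmin(errors))]
print(best_cld)
```

    Because the quantiser choice is made against the actual reconstruction error, the closed loop can only do as well as or better than quantising the analysed CLD, which mirrors the improvement reported for the closed-loop R-OTT.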

    Multi-view video coding via virtual view generation

    In this paper, a multi-view video coding method via generation of virtual picture sequences is proposed. Pictures are synthesized to better exploit the redundancies between neighbouring views in a multi-view sequence. Pictures are synthesized through a 3D warping method to estimate certain views in a multi-view set. Depth maps and associated colour video sequences are used for view generation and tests. The H.264/AVC-based MVC draft software is used for coding the colour videos and depth maps, as well as certain views that are predicted from the virtually generated views. Results for coding these views with the proposed method are compared against the reference H.264/AVC simulcast method under low-delay coding scenarios. The rate-distortion performance of the proposed method outperforms that of the reference method at all bit-rates.
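    A minimal forward-warping sketch of the depth-based view synthesis idea is given below. It is a deliberate simplification: real 3D warping uses full camera projection matrices, whereas here the geometry is collapsed into a single hypothetical horizontal disparity scale.

```python
import numpy as np

def warp_view(color, depth, disparity_scale=8.0):
    """Forward-warp a color image to a neighbouring virtual viewpoint.

    Each pixel shifts horizontally by a disparity derived from its depth;
    nearer pixels (larger normalised depth value) shift further.
    Returns the warped image and a mask of filled pixels; unfilled
    pixels are disocclusion holes that would need inpainting.
    """
    h, w = depth.shape
    warped = np.zeros_like(color)
    filled = np.zeros((h, w), dtype=bool)
    disparity = np.round(disparity_scale * depth).astype(int)
    for y in range(h):
        for x in range(w):
            xt = x + disparity[y, x]
            if 0 <= xt < w:
                warped[y, xt] = color[y, x]
                filled[y, xt] = True
    return warped, filled

rng = np.random.default_rng(2)
color = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
depth = rng.random(size=(16, 16))  # normalised depth in [0, 1)
virtual, filled = warp_view(color, depth)
print(filled.mean())  # fraction of the virtual view covered by warped pixels
```

    The synthesized picture then serves as a prediction reference for the real view at that position, so only the (small) prediction residual needs to be coded.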

    Towards an LTE hybrid unicast broadcast content delivery framework

    The era of ubiquitous access to a rich selection of interactive, high-quality multimedia has begun; with it, significant challenges in data demand have been placed on mobile network technologies. Content creators and broadcasters alike have embraced the additional capabilities offered by network delivery, diversifying content offerings and providing viewers with far greater choice. Mobile broadcast services introduced as part of the Long Term Evolution (LTE) standard, which are to be further enhanced with the release of 5G, aid in the spectrally efficient delivery of popular live multimedia to many mobile devices, but ultimately rely on all users expressing interest in the same single stream. The research presented herein explores the development of a standards-aligned, multi-stream-aware framework, allowing mobile network operators the efficiency gains of broadcast whilst continuing to offer personalised experiences to subscribers. An open-source, system-level simulation platform is extended to support broadcast, characterised and validated. This is followed by the implementation of a Hybrid Unicast Broadcast Synchronisation (HUBS) framework able to dynamically vary broadcast resource allocation. The HUBS framework is then further expanded to make use of scalable video content.
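    The unicast/broadcast trade-off underlying such a framework can be sketched with a toy cost model. This is not the HUBS allocation algorithm: the function name, the per-viewer unicast cost, and the fixed broadcast cost are assumptions chosen purely to illustrate when broadcasting a stream pays off.

```python
def allocate(stream_viewers, broadcast_cost=1.0, unicast_cost_per_viewer=0.3):
    """Choose broadcast or unicast per stream by comparing resource costs.

    Broadcast has a fixed cost regardless of audience size; unicast
    cost grows with the number of viewers, so popular streams tip
    towards broadcast.
    """
    plan = {}
    for stream, viewers in stream_viewers.items():
        unicast_cost = viewers * unicast_cost_per_viewer
        plan[stream] = "broadcast" if broadcast_cost < unicast_cost else "unicast"
    return plan

demand = {"match_hd": 40, "news": 2, "vlog": 1}
print(allocate(demand))
```

    A multi-stream-aware scheduler generalises this comparison across time, re-evaluating the split as audience interest shifts, which is what allows personalised streams to coexist with spectrally efficient broadcast.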

    Adaptive delivery of immersive 3D multi-view video over the Internet

    The increase in Internet bandwidth and the developments in 3D video technology have paved the way for the delivery of 3D Multi-View Video (MVV) over the Internet. However, large amounts of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D video experience may well be degraded unless content-aware precautionary mechanisms and adaptation methods are deployed. In this work, a novel adaptive MVV streaming method is introduced that addresses future-generation immersive 3D MVV experiences with multi-view displays. When the user experiences network congestion that makes adaptation necessary, the rate-distortion-optimal set of views, pre-determined by the server, is truncated from the delivered MVV streams. In order to maintain a high Quality of Experience (QoE) during frequent network congestion, the proposed method involves the calculation of low-overhead additional metadata that is delivered to the client. The proposed adaptive 3D MVV streaming solution is tested using the MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard. Extensive objective and subjective evaluations are presented, showing that the proposed method provides significant quality enhancement under adverse network conditions.
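    The server-guided view truncation can be sketched as a simple client-side selection loop. The view names, bitrates, and priority values below are illustrative assumptions; in the described method the ordering would come from the server's rate-distortion analysis, delivered as low-overhead metadata.

```python
def adapt_views(views, available_kbps):
    """views: list of (name, kbps, priority); lower-priority views are dropped first.

    Keeps views in descending priority order until the throughput
    budget is exhausted, truncating the rest from the delivered stream.
    """
    ranked = sorted(views, key=lambda v: v[2], reverse=True)
    selected, total = [], 0.0
    for name, kbps, _ in ranked:
        if total + kbps <= available_kbps:
            selected.append(name)
            total += kbps
    return selected

views = [("center", 2000, 3), ("left", 1500, 2), ("right", 1500, 2), ("far_left", 1000, 1)]
print(adapt_views(views, 5200))  # under congestion, the lowest-priority view is truncated
```

    Because missing views can often be re-synthesized at the client from the delivered views and depth data, truncating the least important views degrades the multi-view experience far less than uniform packet loss would.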

    Towards adaptive control in smart homes: Overall system design and initial evaluation of activity recognition

    This paper proposes an approach for adaptive control over devices within a smart home, by learning user behavior and preferences over time. The proposed solution leverages three components: activity recognition for inferring the state of a user, ontologies for finding relevant devices within a smart home, and machine learning for decision making. In this paper, the focus is on the first component. Existing algorithms for activity recognition are systematically evaluated on a real-world dataset. A thorough analysis of the algorithms’ accuracy is presented, with a focus on the structure of the selected dataset. Finally, a further study of the dataset is carried out, aiming to identify the factors that influence activity recognition performance.
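    A systematic evaluation of recognition algorithms typically scores each against a simple baseline on held-out data. The sketch below shows only that evaluation scaffold with a majority-class baseline; the activity labels are made up for illustration and real comparisons would substitute the actual algorithms and dataset splits.

```python
from collections import Counter

# Hypothetical activity labels from a smart-home dataset (illustrative only).
train_labels = ["sleep", "sleep", "cook", "sleep", "watch_tv"]
test_labels = ["sleep", "cook", "sleep"]

# Majority-class baseline: always predict the most frequent training label.
majority = Counter(train_labels).most_common(1)[0][0]
predictions = [majority] * len(test_labels)

# Hold-out accuracy: the floor any real recognition algorithm should beat.
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(majority, accuracy)
```

    Reporting results relative to such a baseline matters on smart-home data, where activity classes like sleeping are heavily over-represented and raw accuracy alone can be misleading.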

    No-reference depth map quality evaluation model based on depth map edge confidence measurement in immersive video applications

    When it comes to evaluating the perceptual quality of digital media for overall quality of experience assessment in immersive video applications, two main approaches stand out: subjective and objective quality evaluation. On one hand, subjective quality evaluation offers the best representation of perceived video quality as assessed by real viewers. On the other hand, it consumes a significant amount of time and effort, due to the involvement of real users in lengthy and laborious assessment procedures. Thus, it is essential that an objective quality evaluation model is developed. The speed-up offered by an objective quality evaluation model that can predict the quality of rendered virtual views based on the depth maps used in the rendering process allows for faster quality assessments in immersive video applications. This is particularly important given the lack of a suitable reference or ground truth for comparing the available depth maps, especially when live content services are offered in those applications. This paper presents a no-reference depth map quality evaluation model based on a proposed depth map edge confidence measurement technique to assist in accurately estimating the quality of rendered (virtual) views in immersive multi-view video content. The model is applied to depth image-based rendering in multi-view video format, providing evaluation results comparable to those in the literature, and often exceeding their performance.
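    One simplified way to think about depth edge confidence is sketched below; the paper's actual metric is more elaborate, and the gradient thresholds here are arbitrary assumptions. The premise is that depth edges which coincide with colour (texture) edges are trustworthy, while misaligned depth edges are a common cause of rendering artefacts in depth image-based rendering.

```python
import numpy as np

def edge_map(img, thresh):
    """Binary edge map from gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy) > thresh

def depth_edge_confidence(depth, color, d_thresh=10.0, c_thresh=10.0):
    """Fraction of depth edges that coincide with colour edges (in [0, 1])."""
    d_edges = edge_map(depth, d_thresh)
    c_edges = edge_map(color, c_thresh)
    if not d_edges.any():
        return 1.0  # no depth edges means nothing can be misaligned
    return float((d_edges & c_edges).sum() / d_edges.sum())

# Synthetic check: identical step edges in depth and colour give full confidence.
img = np.zeros((32, 32))
img[:, 16:] = 100.0
print(depth_edge_confidence(img, img))
```

    A per-depth-map score of this kind needs no reference view, which is what makes a no-reference model usable for live content where ground truth is unavailable.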

    Predicting head trajectories in 360° virtual reality videos

    In this paper, a fixation-prediction-based saliency algorithm is used to predict the head movements of viewers watching virtual reality (VR) videos, by modelling the relationship between fixation predictions and recorded head movements. The saliency algorithm is applied to viewings faithfully recreated from recorded head movements. Spherical cross-correlation analysis is performed between predicted attention centres and actual viewing centres in order to identify prevalent lengths of predictable attention and how early they can be predicted. The results show that fixation-prediction-based saliency analysis correlates with head movements only for limited durations. Therefore, further classification of the durations for which saliency analysis is predictive is required.
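    The basic building block of comparing predicted attention centres with recorded head directions on the viewing sphere is the angular (great-circle) distance between two directions. The sketch below shows only that per-frame error under an assumed (yaw, pitch) parameterisation in radians; the full analysis in the text uses spherical cross-correlation over whole trajectories.

```python
import numpy as np

def to_unit_vector(yaw, pitch):
    """Map a (yaw, pitch) viewing direction to a 3D unit vector."""
    return np.array([
        np.cos(pitch) * np.cos(yaw),
        np.cos(pitch) * np.sin(yaw),
        np.sin(pitch),
    ])

def angular_error(pred, actual):
    """Angle in radians between two (yaw, pitch) viewing directions."""
    u, v = to_unit_vector(*pred), to_unit_vector(*actual)
    # Clip guards against floating-point drift outside arccos's domain.
    return float(np.arccos(np.clip(np.dot(u, v), -1.0, 1.0)))

print(angular_error((0.0, 0.0), (0.0, 0.0)))    # identical directions
print(angular_error((0.0, 0.0), (np.pi, 0.0)))  # opposite yaw
```

    Aggregating this error over sliding windows, and tracking how far ahead the prediction was made, is what lets one measure for how long, and how early, attention remains predictable.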

    Modeling user perception of 3D video based on ambient illumination context for enhanced user centric media access and consumption

    For 3D video to be enjoyed to its full extent, it is imperative that access to and consumption of it are user-centric, which in turn ensures improved 3D video perception. Several important factors, including video characteristics, users’ preferences, and contexts prevailing in various usage environments, influence 3D video perception. Thus, to assist the efficient provision of user-centric media, user perception of 3D video should be modeled considering the factors affecting perception. Considering the ambient illumination context when modeling 3D video perception is an interesting research topic that has not been particularly investigated in the literature. This context is taken into account while modeling the video quality and depth perception of 3D video in this paper. Motion and structural feature characteristics of the color texture sequences are used as primary content-related factors for the video quality perception model, while the luminance contrast of the color texture and the depth intensity of the depth map sequences are used for the depth perception model. Results derived using the video quality and depth perception models demonstrate that these models can efficiently predict user perception of 3D video considering the ambient illumination context in user-centric media access and consumption environments.
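    Two of the content features named for the depth perception model, luminance contrast of the colour texture and depth intensity of the depth map, can be computed under assumed definitions as below. The exact feature definitions used by the models may differ; RMS contrast and a normalised mean are common stand-ins.

```python
import numpy as np

def rms_contrast(luma):
    """RMS luminance contrast: std of luma normalised by its mean."""
    luma = luma.astype(float)
    return float(luma.std() / (luma.mean() + 1e-9))

def mean_depth_intensity(depth):
    """Mean 8-bit depth map value, normalised to [0, 1]."""
    return float(depth.astype(float).mean() / 255.0)

rng = np.random.default_rng(3)
texture = rng.integers(0, 256, size=(64, 64))  # stand-in luma frame
depth = rng.integers(0, 256, size=(64, 64))    # stand-in depth map frame
print(rms_contrast(texture), mean_depth_intensity(depth))
```

    Per-frame features like these, combined with a measured ambient illumination level, are the kind of inputs such perception models regress against subjective scores.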